com.doclinx.ftxml
Class CatalogSearch

java.lang.Object
  |
  +--com.doclinx.ftxml.CatalogSearch

public final class CatalogSearch
extends java.lang.Object

This class is the API for the search and retrieval component of TeraXML. The class accesses a catalog created and managed by the CatalogManager class. The catalog can be searched using a boolean query language with results being organized by several sort criteria.

See Also:
CatalogManager

Field Summary
 boolean bObsolete
          Set if the CatalogSearch handle has been notified of DB update.
static int FOLD_DIGIT
          Folding control bits for numerics and/or latin characters.
static int FOLD_LATIN
          catXSSearch mode setting.
static int MODE_ALTWORD
          Include the lookup word in the alternate lookup string.
static int MODE_ANDORMODE
          Search Modes for catXSSearch method mode parameter Search Operation.
static int MODE_BOTH
          Search both PRIMARY and UPDATE databases.
static int MODE_FOLD_DIGIT
          Search Modes for catXSSearch method mode parameter.
static int MODE_FOLD_LATIN
          Search Modes for catXSSearch method mode parameter.
static int MODE_FUZZY
          Perform fuzzy searches (sounds like).
static int MODE_NODEINFO
          Search Mode for catXSSearch method mode parameter.
static int MODE_NOXPATHOK
          Search Modes for catXSSearch method mode parameter Enable/Disable error return when xpath not found.
static int MODE_PHRASES
          Search Mode for catXSSearch method mode parameter.
static int MODE_PLURAL
          Search for plurals and possessives.
static int MODE_PRM
          Search only PRIMARY database.
static int MODE_RELEVANCY
          Search Modes for catXSSearch method mode parameter Search Order.
static int MODE_STEM
          Perform stemmed searches (word folding)
static int MODE_SUBDELS
          Search Modes for catXSGetDocMax method mode parameter.
static int MODE_THES
          Use thesaurus to expand search terms.
static int MODE_UPD
          Search only UPDATE database.
static int MODE_USE_PREVIOUS_SEARCH
          Search Modes for catXSSearch method mode parameter Use the document set established by the previous search.
static int XS_ALLOCERROR
          Catalog Search Error Codes
static int XS_BADHANDLE
          Catalog Search Error Codes
 com.doclinx.ftxml.CatalogManager xs_catalog
          Internal Use Only
static int XS_CATERROR
          Catalog Search Error Codes
static int XS_CATOPEN
          Catalog Search Error Codes
static int XS_CLOSERROR
          Catalog Search Error Codes
static int XS_CONTEXTERROR
          Catalog Search Error Codes
static int XS_DBOPEN
          Catalog Search Error Codes
 long xs_docs
           
static int XS_ERROR
          Catalog Search Error Codes
 com.doclinx.ftxml.Thes[] xs_fuzzy
          Internal Use Only
static int XS_FUZZYOPEN
          Catalog Search Error Codes
 long xs_hits
           
static int XS_INITERROR
          Catalog Search Error Codes
static java.lang.String XS_LOG_FILENAME
          Log file name
static int XS_LOGICERROR
          Catalog Search Error Codes
static int XS_NODB
          Catalog Search Error Codes
static int XS_NODOCS
          Catalog Search Error Codes
static int XS_NOHITDATA
          Catalog Search Error Codes
static int XS_NOPRIMARY
          Catalog Search Error Codes
static int XS_NORESULTS
          Catalog Search Error Codes
static int XS_NOUPDATE
          Catalog Search Error Codes
static int XS_OBSOLETE
          Catalog Search Error Codes
static int XS_PARMERROR
          Catalog Search Error Codes
static int XS_PATHNOTFOUND
          Catalog Search Error Codes
static int XS_PRM
          Indicator for primary portions of a catalog.
static int XS_RELTAG
          Catalog Search Error Codes
static int XS_RSERROR
          Catalog Search Error Codes
static int XS_RSQUERY
          Catalog Search Error Codes
static int XS_RSSEARCH
          Catalog Search Error Codes
static int XS_SELECTERROR
          Catalog Search Error Codes
 com.doclinx.ftxml.Thes[] xs_stem
          Internal Use Only
static int XS_STEMOPEN
          Catalog Search Error Codes
static int XS_SUCCESS
          Catalog Search Error Returns.
 com.doclinx.ftxml.Thes xs_thes
          Internal Use Only
static int XS_THESOPEN
          Catalog Search Error Codes
static int XS_UNKNOWN
          Catalog Search Error Codes
static int XS_UPD
          Indicator for update portions of a catalog.
static int XS_WORDEXISTS
          Catalog Search Error Codes
 
Constructor Summary
CatalogSearch()
          API: Constructor for CatalogSearch Object.
 
Method Summary
static void catXSClearCache(java.lang.String sPath)
           
static void catXSClearEntireCache()
           
 void catXSClearSearch(boolean reload)
          API: Clears buffered search hits to reduce memory use after a search and after the document search result list has been built.
 void catXSClose()
          API: Closes CatalogSearch object and releases any resources.
 java.lang.String catXSContextXPath(int docId, char contextId)
          API: Return the string denoting the XPATH where a match (hit) occurred.
static void catXSFinish()
          API: Close all outstanding CatalogSearch objects that are OPEN.
 com.doclinx.jftr.FTR catXSFTRInfo(com.doclinx.jftr.RS_DBH[] dbh)
          API: Returns low-level FTR handle.
 java.lang.String catXSGetAlts(int mode, java.lang.String symbol)
          API: Retrieve alternate word list via supported methods.
 com.doclinx.ftxml.CatalogManager catXSGetCatalog()
          API: Return CatalogManager handle associated with this search object.
 com.doclinx.jftr.DOCHIT catXSGetDoc(long index)
          API: Retrieves the document number (catalog entry) and total number of matches for one item in a set of search results.
 long catXSGetDocCount()
          API: Retrieve the number of documents found in the last search.
 com.doclinx.jftr.DOCHIT[] catXSGetDocList(int start, int nItems)
          API: Retrieves the document number (catalog entry) and total number of matches for multiple items in a set of search results.
 long catXSGetDocMax()
          API: Retrieve the current active number of documents in the catalog.
 long catXSGetDocMax(int mode)
          API: Retrieve the maximum Document ID (catalog entry #) for the primary, update, or both databases.
 int catXSGetExtendedError()
          API: Return extended error code, refining errors for query or search.
 java.lang.String catXSGetHF(long docId)
          API: Create PDF Highlight File Format for a given document hit.
 long catXSGetHitCount()
          API: Retrieve the total number of hits matching the last search query.
 int catXSGetHitsReturned()
          API: Returns the number of hits from last search AND then releases search result objects.
 com.doclinx.jftr.DOCHIT[] catXSGetRelevancyList(int start, int nItems)
          API: Retrieves the document number (catalog entry) and relevancy number for multiple items in a set of search results.
 java.util.Vector catXSGetSearchTerms()
          API: Retrieves symbol and other data from boolean query as Vector of String or NodeInfo objects.
 long catXSGetSearchTime()
          API: Retrieve time required for last search (in milliseconds).
 com.doclinx.jftr.DOCHIT[] catXSGetSortedDocList(int start, int nItems, int key)
          API: Retrieves a selected # of document hits in sorted order; the document hit set may be sorted by any item in a catalog entry.
 com.doclinx.jftr.RDOCHIT[] catXSGetSortedDocList(int start, int nItems, java.lang.String[] attrs)
          API: Retrieves a selected # of document hits in sorted order; the document hit set can be sorted by an entry's set of metadata attributes.
 java.util.Vector catXSHitList(int[] aaList, int start, int nItems)
          API: Retrieve the hit list(word location) Vector resulting from a search.
 boolean catXSIsObsolete()
          API: Checks to see if the search catalog has been updated.
static void catXSMaxDocList(int maxSize)
          API: Sets the maximum number of DOCHIT objects for in-memory storage.
 void catXSOpen(java.lang.String sPath, java.lang.String sName)
          API: Open a CatalogSearch object on an existing catalog.
 void catXSSearch(java.lang.String sQuery, int mode)
          API: Search an opened catalog based upon query string and mode of operation.
 void catXSSearchAbort()
          API: Asynchronous abort for current search (running in another thread)
 void catXSSerial(com.doclinx.ftxml.XS_SYNC sync)
           
static void catXSSetLocLimit(int limit)
          API: Sets upper limit for unique words in wildcard or range searches.
 void catXSSetLog(com.doclinx.jftr.Log sLog)
          API: Set logger.
 void catXSSetLogFile(java.lang.String sFileName, int level)
          API: Set logging file and severity level for class error reporting.
 void catXSSetRelevancyTags(int val, java.lang.String tag)
          API: Set multiplier for highly relevant tags (e.g.
 java.util.Vector catXSSpellSuggest(int mode, java.lang.String symbol)
          API: Retrieve alternate word list via supported methods.
 java.util.Vector[] catXSWildLookup(com.doclinx.jftr.DOCHIT[] dh, java.lang.String sep)
          API: Retrieve a vector array of String, each vector element containing the actual word match from a search query containing wildcards; one vector per document in the hashtable. See catXSSearch for performing search.
 void catXSWildLookup(java.util.Hashtable ht, java.lang.String sep, java.util.Vector[] v)
          API: Retrieve a vector array of String, each vector element containing the actual word match from a search query containing wildcards; one vector per document in the hashtable. See catXSSearch for performing search.
 java.util.Vector catXSWildLookup(long docId, java.lang.String sep)
          API: Retrieve a vector of String, each element containing the actual word match from a search query containing wildcards.
static int DOC(char[] aaList)
          API: Construct DocId from attribute array values.
static int DOC(char doc, char hiDoc)
          API: Construct DocId from attribute array values.
static java.lang.String explain(int iStatus)
          Explains catalog search error (always long version).
static java.lang.String explain(int iStatus, boolean bLong)
          Explains catalog search error.
 void finalize()
          Override default finalize to ensure catalog search handle closed.
 int getMaxRelevancy()
          Returns maximum relevancy value found in last search.
 java.lang.String getName()
           
 java.lang.String getQuery()
           
 void intCatXSClose()
          INTERNAL USE ONLY.
 void intCatXSOpen(com.doclinx.jftr.BLList del)
          INTERNAL USE ONLY.
static boolean PRM(int mode)
          Helper function to determine if MODE_PRM bit set
static void setHCache(boolean set)
           
 void setName(java.lang.String name)
           
static boolean UPD(int mode)
          Helper function to determine if MODE_UPD bit set
 
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

XS_PRM

public static final int XS_PRM
Indicator for primary portions of a catalog.

See Also:
Constant Field Values

XS_UPD

public static final int XS_UPD
Indicator for update portions of a catalog.

See Also:
Constant Field Values

XS_LOG_FILENAME

public static final java.lang.String XS_LOG_FILENAME
Log file name

See Also:
Constant Field Values

FOLD_DIGIT

public static final int FOLD_DIGIT
Folding control bits for numerics and/or latin characters. See also catXSSearch mode parameter. See also MODE_PRM for mode settings.

See Also:
Constant Field Values

FOLD_LATIN

public static final int FOLD_LATIN
catXSSearch mode setting.

See Also:
Constant Field Values

XS_SUCCESS

public static final int XS_SUCCESS
Catalog Search Error Returns.
        XS_SUCCESS      - No error
        XS_ERROR        - General error.
        XS_INITERROR    - Unable to initialize search library (missing DB files).
        XS_PARMERROR    - Bad parameter values (range or null) to search function.
        XS_CATOPEN      - Unable to open CatalogManager handle, bad path or missing catalog.
        XS_DBOPEN       - Unable to open database handle in catalog (.prm or .upd files).
        XS_BADHANDLE    - CatalogSearch object not initialized or opened.
        XS_RSERROR      - Search library error. See error function for more details.
        XS_RSQUERY      - Bad query for catXSSearch. See error function for more details.
        XS_NOUPDATE     - Attempt to search just update database and update does not exist.
        XS_NOPRIMARY    - Attempt to search primary with no index data.
        XS_NORESULTS    - Request for results when no search performed or empty search (e.g. catXSGetDoc method).
        XS_NODB         - Attempt to open database that has not been indexed.
        XS_CLOSERROR    - Error closing CatalogManager object.
        XS_CATERROR     - **deprecated.
        XS_LOGICERROR   - **deprecated.
        XS_ALLOCERROR   - **deprecated.
        XS_CONTEXTERROR - Unable to open context handle.
        XS_THESOPEN     - Unable to open Thesaurus object.
        XS_STEMOPEN     - Unable to open Stemming map object.
        XS_FUZZYOPEN    - Unable to open Fuzzy lookup object.
        XS_SELECTERROR  - Search library error. See error function for more details.
        XS_OBSOLETE     - **deprecated
        XS_NODOCS       - No documents in database. (all deleted).
        XS_RSSEARCH     - Search error (but not query format error).
                          See error function for more details. 
        XS_RELTAG       - Error in tag lookup for relevancy weighting.
        XS_PATHNOTFOUND - No context or context not found for given ID.
        XS_NOHITDATA    - Document not PDF or no hit data collected.

        XS_UNKNOWN      - Unexpected error in CatalogSearch method.
    

See Also:
Constant Field Values

XS_ERROR

public static final int XS_ERROR
Catalog Search Error Codes

See Also:
Constant Field Values

XS_INITERROR

public static final int XS_INITERROR
Catalog Search Error Codes

See Also:
Constant Field Values

XS_PARMERROR

public static final int XS_PARMERROR
Catalog Search Error Codes

See Also:
Constant Field Values

XS_CATOPEN

public static final int XS_CATOPEN
Catalog Search Error Codes

See Also:
Constant Field Values

XS_DBOPEN

public static final int XS_DBOPEN
Catalog Search Error Codes

See Also:
Constant Field Values

XS_BADHANDLE

public static final int XS_BADHANDLE
Catalog Search Error Codes

See Also:
Constant Field Values

XS_RSERROR

public static final int XS_RSERROR
Catalog Search Error Codes

See Also:
Constant Field Values

XS_RSQUERY

public static final int XS_RSQUERY
Catalog Search Error Codes

See Also:
Constant Field Values

XS_NOUPDATE

public static final int XS_NOUPDATE
Catalog Search Error Codes

See Also:
Constant Field Values

XS_NOPRIMARY

public static final int XS_NOPRIMARY
Catalog Search Error Codes

See Also:
Constant Field Values

XS_NORESULTS

public static final int XS_NORESULTS
Catalog Search Error Codes

See Also:
Constant Field Values

XS_NODB

public static final int XS_NODB
Catalog Search Error Codes

See Also:
Constant Field Values

XS_CLOSERROR

public static final int XS_CLOSERROR
Catalog Search Error Codes

See Also:
Constant Field Values

XS_CATERROR

public static final int XS_CATERROR
Catalog Search Error Codes

See Also:
Constant Field Values

XS_LOGICERROR

public static final int XS_LOGICERROR
Catalog Search Error Codes

See Also:
Constant Field Values

XS_ALLOCERROR

public static final int XS_ALLOCERROR
Catalog Search Error Codes

See Also:
Constant Field Values

XS_CONTEXTERROR

public static final int XS_CONTEXTERROR
Catalog Search Error Codes

See Also:
Constant Field Values

XS_THESOPEN

public static final int XS_THESOPEN
Catalog Search Error Codes

See Also:
Constant Field Values

XS_STEMOPEN

public static final int XS_STEMOPEN
Catalog Search Error Codes

See Also:
Constant Field Values

XS_FUZZYOPEN

public static final int XS_FUZZYOPEN
Catalog Search Error Codes

See Also:
Constant Field Values

XS_SELECTERROR

public static final int XS_SELECTERROR
Catalog Search Error Codes

See Also:
Constant Field Values

XS_OBSOLETE

public static final int XS_OBSOLETE
Catalog Search Error Codes

See Also:
Constant Field Values

XS_NODOCS

public static final int XS_NODOCS
Catalog Search Error Codes

See Also:
Constant Field Values

XS_RSSEARCH

public static final int XS_RSSEARCH
Catalog Search Error Codes

See Also:
Constant Field Values

XS_RELTAG

public static final int XS_RELTAG
Catalog Search Error Codes

See Also:
Constant Field Values

XS_PATHNOTFOUND

public static final int XS_PATHNOTFOUND
Catalog Search Error Codes

See Also:
Constant Field Values

XS_NOHITDATA

public static final int XS_NOHITDATA
Catalog Search Error Codes

See Also:
Constant Field Values

XS_WORDEXISTS

public static final int XS_WORDEXISTS
Catalog Search Error Codes

See Also:
Constant Field Values

XS_UNKNOWN

public static final int XS_UNKNOWN
Catalog Search Error Codes

See Also:
Constant Field Values

MODE_PRM

public static final int MODE_PRM
Search only PRIMARY database.

See Also:
Constant Field Values

MODE_UPD

public static final int MODE_UPD
Search only UPDATE database.

See Also:
Constant Field Values

MODE_BOTH

public static final int MODE_BOTH
Search both PRIMARY and UPDATE databases.

See Also:
Constant Field Values

MODE_PLURAL

public static final int MODE_PLURAL
Search for plurals and possessives.

See Also:
Constant Field Values

MODE_FUZZY

public static final int MODE_FUZZY
Perform fuzzy searches (sounds like).

See Also:
Constant Field Values

MODE_STEM

public static final int MODE_STEM
Perform stemmed searches (word folding)

See Also:
Constant Field Values

MODE_THES

public static final int MODE_THES
Use thesaurus to expand search terms.

See Also:
Constant Field Values

MODE_ALTWORD

public static final int MODE_ALTWORD
Include the lookup word in the alternate lookup string.

See Also:
Constant Field Values

MODE_ANDORMODE

public static final int MODE_ANDORMODE
Search Modes for catXSSearch method mode parameter: Search Operation. Conduct AND operations as follows: for "AND" operations where both terms occur, include ALL terms matching either term within the document. Normal AND mode includes only the "MIN" term of the two.

See Also:
Constant Field Values

MODE_NOXPATHOK

public static final int MODE_NOXPATHOK
Search Modes for catXSSearch method mode parameter Enable/Disable error return when xpath not found.

See Also:
Constant Field Values

MODE_USE_PREVIOUS_SEARCH

public static final int MODE_USE_PREVIOUS_SEARCH
Search Modes for catXSSearch method mode parameter Use the document set established by the previous search.

See Also:
Constant Field Values

MODE_RELEVANCY

public static final int MODE_RELEVANCY
Search Modes for catXSSearch method mode parameter Search Order. Order results by relevancy ranking.

See Also:
Constant Field Values

MODE_FOLD_DIGIT

public static final int MODE_FOLD_DIGIT
Search Modes for catXSSearch method mode parameter.
  MODE_FOLD_DIGIT: Fold all digit chars to base 0..9
  MODE_FOLD_LATIN: Fold Latin accented chars to base 26-letter set

  Character folding modes (Note: Must be > 0x2000)

See Also:
Constant Field Values
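
As an illustration of what the two folding modes describe, the sketch below folds accented Latin letters and non-ASCII decimal digits using only the standard Java library. It is not TeraXML's implementation: the class FoldingSketch and its methods are hypothetical, and the library's own folding tables may differ (for example, for letters such as 'ø' that have no Unicode decomposition and are left unchanged here).

```java
import java.text.Normalizer;

// Hypothetical helper, for illustration only; not part of com.doclinx.ftxml.
final class FoldingSketch {

    // Fold accented Latin letters to their base letters by decomposing the
    // text (NFD) and then dropping the combining marks.
    static String foldLatin(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Fold any Unicode decimal digit (Arabic-Indic, fullwidth, ...) to the
    // ASCII digits 0..9; every other character is copied unchanged.
    static String foldDigits(String s) {
        StringBuilder out = new StringBuilder(s.length());
        s.codePoints().forEach(cp -> {
            int d = Character.digit(cp, 10);
            if (Character.getType(cp) == Character.DECIMAL_DIGIT_NUMBER && d >= 0) {
                out.append((char) ('0' + d));
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }
}
```

With folding of this kind applied to both the index and the query, a query for "resume" would also match "résumé".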

MODE_FOLD_LATIN

public static final int MODE_FOLD_LATIN
Search Modes for catXSSearch method mode parameter.

See Also:
Constant Field Values

MODE_SUBDELS

public static final int MODE_SUBDELS
Search Modes for catXSGetDocMax method mode parameter. Also see values for other mode settings. Note: Mode values for catXSGetDocMax can ONLY include MODE_PRM, MODE_UPD, MODE_BOTH and MODE_SUBDELS

See Also:
Constant Field Values

MODE_NODEINFO

public static final int MODE_NODEINFO
Search Mode for catXSSearch method mode parameter. Return vector of NodeInfo for each query node. (default Symbols)

See Also:
Constant Field Values

MODE_PHRASES

public static final int MODE_PHRASES
Search Mode for catXSSearch method mode parameter. Return vector of Strings (terms AND phrases)

See Also:
Constant Field Values

xs_catalog

public com.doclinx.ftxml.CatalogManager xs_catalog
Internal Use Only


xs_docs

public long xs_docs

xs_hits

public long xs_hits

xs_thes

public com.doclinx.ftxml.Thes xs_thes
Internal Use Only


xs_stem

public com.doclinx.ftxml.Thes[] xs_stem
Internal Use Only


xs_fuzzy

public com.doclinx.ftxml.Thes[] xs_fuzzy
Internal Use Only


bObsolete

public boolean bObsolete
Set if the CatalogSearch handle has been notified of DB update.

Constructor Detail

CatalogSearch

public CatalogSearch()
API: Constructor for CatalogSearch Object.

Method Detail

setName

public void setName(java.lang.String name)

getName

public java.lang.String getName()

getQuery

public java.lang.String getQuery()

getMaxRelevancy

public int getMaxRelevancy()
Returns maximum relevancy value found in last search. Note that this value is only valid after a relevancy sort. See catXSGetSortedDocList for sorting details.

Returns:
int maximum relevancy number.

explain

public static java.lang.String explain(int iStatus)
Explains catalog search error (always long version).

Parameters:
iStatus - CatalogSearch error code.
Returns:
English string explanation of error code.

explain

public static java.lang.String explain(int iStatus,
                                       boolean bLong)
Explains catalog search error.

Parameters:
iStatus - CatalogSearch error code.
bLong - If true, return long version of error.
Returns:
English string explanation of error code.

setHCache

public static void setHCache(boolean set)

catXSClearEntireCache

public static void catXSClearEntireCache()

catXSClearCache

public static void catXSClearCache(java.lang.String sPath)

PRM

public static boolean PRM(int mode)
Helper function to determine if MODE_PRM bit set

Parameters:
mode - Mode field bit to check to see if primary database.
Returns:
true if primary DB bit set.

UPD

public static boolean UPD(int mode)
Helper function to determine if MODE_UPD bit set

Parameters:
mode - Mode field bit to check to see if update database.
Returns:
true if update DB bit set.
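
The mode parameter is a set of bit flags, so the PRM and UPD helpers amount to a bit test. The sketch below shows that pattern. The constant values are hypothetical placeholders (the real ones are listed under Constant Field Values), and treating MODE_BOTH as the OR of MODE_PRM and MODE_UPD is an assumption, not something this page states.

```java
// Hypothetical stand-in for the mode flags; values are illustrative only.
final class ModeBitsSketch {
    static final int MODE_PRM  = 0x0001;              // placeholder value
    static final int MODE_UPD  = 0x0002;              // placeholder value
    static final int MODE_BOTH = MODE_PRM | MODE_UPD; // assumed relationship
    static final int MODE_STEM = 0x0010;              // placeholder value

    // Same idea as CatalogSearch.PRM(mode): true if the primary bit is set.
    static boolean isPrimary(int mode) {
        return (mode & MODE_PRM) != 0;
    }

    // Same idea as CatalogSearch.UPD(mode): true if the update bit is set.
    static boolean isUpdate(int mode) {
        return (mode & MODE_UPD) != 0;
    }
}
```

Options are combined with bitwise OR, for example MODE_PRM | MODE_STEM for a stemmed search of the primary database only.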

finalize

public void finalize()
Override default finalize to ensure catalog search handle closed.

Overrides:
finalize in class java.lang.Object

catXSOpen

public void catXSOpen(java.lang.String sPath,
                      java.lang.String sName)
               throws CatalogSearchException
API: Open a CatalogSearch object on an existing catalog. The catalog location is indicated by the directory path and a sub-directory name.

Parameters:
sPath - Directory path of existing catalog group.
sName - Sub-directory name of specific catalog to open.
Throws:
CatalogSearchException - See constant definitions.

intCatXSOpen

public void intCatXSOpen(com.doclinx.jftr.BLList del)
                  throws CatalogSearchException
INTERNAL USE ONLY.

Throws:
CatalogSearchException

catXSClose

public void catXSClose()
                throws CatalogSearchException
API: Closes CatalogSearch object and releases any resources.

Throws:
CatalogSearchException - See constant definitions.

intCatXSClose

public void intCatXSClose()
                   throws CatalogSearchException
INTERNAL USE ONLY.

Throws:
CatalogSearchException

catXSSearch

public void catXSSearch(java.lang.String sQuery,
                        int mode)
                 throws CatalogSearchException
API: Search an opened catalog based upon query string and mode of operation.
 See TeraXML Query Syntax for query format specification.

Parameters:
sQuery - Simple boolean query string for specifying search.
mode - Specifies which DB to search and other search options. See MODE_PRM for start of discussion about search modes and options.
Throws:
CatalogSearchException - See constant definitions.

catXSGetSearchTerms

public java.util.Vector catXSGetSearchTerms()
                                     throws CatalogSearchException
API: Retrieves symbol and other data from boolean query as Vector of String or NodeInfo objects.
 See catXSSearch for details of search function.

Returns:
Returns Vector of search term objects. Note: Vector can contain String or NodeInfo Objects depending upon catXSSearch mode settings. If no special mode setting is used, the default action is to return just the search terms.
See MODE_NODEINFO constant setting.
See MODE_PHRASES constant setting.
Throws:
CatalogSearchException - See constant definitions.

catXSGetDoc

public com.doclinx.jftr.DOCHIT catXSGetDoc(long index)
                                    throws CatalogSearchException
API: Retrieves the document number (catalog entry) and total number of matches for one item in a set of search results. A match occurs when terms or sub-terms in a query match a specific document. There may be multiple matches to the query within a single document. Note that the catXSSearch method must be called before performing this method.
 See catXSSearch for performing search.

Parameters:
index - Indicates which item in the search set to access. Range 1..N, where N is the total number of search results in set.
Returns:
DOCHIT contains document id and number of hits.
Throws:
CatalogSearchException - See constant definitions. See catXSGetDocCount for maximum # hits N.

catXSGetDocList

public com.doclinx.jftr.DOCHIT[] catXSGetDocList(int start,
                                                 int nItems)
                                          throws CatalogSearchException
API: Retrieves the document number (catalog entry) and total number of matches for multiple items in a set of search results. A match occurs when terms or sub-terms in a query match a specific document. There may be multiple matches to the query within a single document. Note that the catXSSearch method must be called before performing this method. This method differs from catXSGetDoc in that an array of DOCHITs is returned, as opposed to a single item.
 See catXSSearch for performing search.
 See catXSGetDocCount for maximum # hits N.

Parameters:
start - Indicates first item to access in the search set. Range 1..N, where N is the total number of search results in the set.
nItems - The total number of items to retrieve. This number must be less than the remaining items in set, from start index.
Returns:
DOCHIT contains document id and number of hits.
Throws:
CatalogSearchException - See constant definitions.
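
Because start is 1-based and nItems is bounded by the items remaining, paging through a result set takes a little arithmetic. The sketch below computes the (start, nItems) pairs for a given total, such as the value returned by catXSGetDocCount. PagingSketch is a hypothetical helper, and it reads the nItems constraint as "must not exceed the items remaining from start"; that reading is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of com.doclinx.ftxml.
final class PagingSketch {

    // Returns {start, nItems} pairs that together cover results 1..total.
    static List<int[]> pages(long total, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        if (total > Integer.MAX_VALUE) {
            // start is an int in the documented signature.
            throw new IllegalArgumentException("total exceeds int range");
        }
        List<int[]> out = new ArrayList<>();
        for (long start = 1; start <= total; start += pageSize) {
            long remaining = total - start + 1; // items left, counting start
            int nItems = (int) Math.min(pageSize, remaining);
            out.add(new int[] { (int) start, nItems });
        }
        return out;
    }
}
```

Each pair could then be passed as the start and nItems arguments of catXSGetDocList; an empty result set yields no pairs, so no call is made.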

catXSGetRelevancyList

public com.doclinx.jftr.DOCHIT[] catXSGetRelevancyList(int start,
                                                       int nItems)
                                                throws CatalogSearchException
API: Retrieves the document number (catalog entry) and relevancy number for multiple items in a set of search results. The relevancy number is a ranking value used to order the terms. The set is organized from largest (most relevant) to smallest. Note that the catXSSearch method must be called before performing this method.
 See catXSSearch for performing search.
 See catXSGetDocCount for maximum # hits N.
 See catXSSetRelevancyTags to set relevancy weights for tags
 and attributes.

Parameters:
start - Indicates first item to access in the search set. Range 1..N, where N is the total number of search results in set.
nItems - The total number of items to retrieve. This number must be less than the remaining items in set, from start index.
Returns:
DOCHIT array where each object contains document id and relevancy # (relevancy value returned in hitCount field).
Throws:
CatalogSearchException - See constant definitions.

catXSGetSortedDocList

public com.doclinx.jftr.DOCHIT[] catXSGetSortedDocList(int start,
                                                       int nItems,
                                                       int key)
                                                throws CatalogSearchException
API: Retrieves a selected # of document hits in sorted order; the document hit set may be sorted by any item in a catalog entry. Sorting is from smallest to largest. String items are sorted lexicographically. Note that the catXSSearch method must be called before performing this method.
 See catXSSearch for performing search.
 See catXSGetDocCount for maximum # hits N.

Parameters:
start - Indicates first item to access in the search set. Range 1..N, where N is the total number of search results in set.
nItems - The total number of items to retrieve. This number must be less than the remaining items in set, from start index.
key - The field number to sort. Must be in the range of catalog items (see CRIDS: Catalog Record Component IDs for values). Note that for the LAST field CAT_AUXINFO, the upper 16-bits of key can contain a sub-field value (0..5) for the text components of this field. (Use 0xffff for sub-field to sort by entire item).
Returns:
DOCHIT array where each object contains document id and number of hits.
Throws:
CatalogSearchException - See constant definitions.
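
The key parameter packs two values into one int for CAT_AUXINFO: the field number in the lower 16 bits and the sub-field in the upper 16 bits. The sketch below shows that packing. SortKeySketch and its method names are hypothetical; the value 9 for CAT_AUXINFO is taken from the field list given for the attrs overload of this method.

```java
// Hypothetical helper, for illustration only; not part of com.doclinx.ftxml.
final class SortKeySketch {
    static final int CAT_AUXINFO  = 9;      // from the field list on this page
    static final int SUBFIELD_ALL = 0xffff; // sort by the entire item

    // Build a sort key for CAT_AUXINFO with the sub-field in the upper 16 bits.
    static int auxInfoKey(int subField) {
        if (subField != SUBFIELD_ALL && (subField < 0 || subField > 5)) {
            throw new IllegalArgumentException("sub-field must be 0..5 or 0xffff");
        }
        return (subField << 16) | CAT_AUXINFO;
    }

    // Lower 16 bits: the catalog field number.
    static int field(int key) {
        return key & 0xffff;
    }

    // Upper 16 bits: the sub-field value.
    static int subField(int key) {
        return key >>> 16;
    }
}
```

For example, auxInfoKey(3) gives 0x00030009 (ALT_TITLE within CAT_AUXINFO), and auxInfoKey(0xffff) gives 0xffff0009 for the entire field.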

catXSGetSortedDocList

public com.doclinx.jftr.RDOCHIT[] catXSGetSortedDocList(int start,
                                                        int nItems,
                                                        java.lang.String[] attrs)
                                                 throws CatalogSearchException
API: Retrieves a selected # of document hits in sorted order; the document hit set can be sorted by an entry's set of metadata attributes. Sort order can be ascending or descending. String items are sorted lexicographically. Other data formats supported are: dates, 32-bit integers, 32-bit real numbers, and case-insensitive text. In addition, the list can be sorted by relevancy, hits, or catalog field entries in combination with metadata keys. Note that the RDOCHIT type is returned, an extension of DOCHIT that includes a separate relevancy count, which allows the hit count to be preserved.
 See catXSSearch for performing search.
 See catXSGetDocCount for maximum # hits N.

Parameters:
start - Indicates first item to access in the search set. Range 1..N, where N is the total number of search results in set.
nItems - The total number of items to retrieve. This number must be less than the remaining items in set, from start index.
attrs - An array of String giving the metadata attribute names by which to sort. Type information can be specified by setting tag semantics in the catalog handle or by following a naming convention. Special pseudo names also allow for sorting by relevancy, hits, or catalog field item.
     Naming conventions:
     
     -   If an attribute name starts with '-', then the sort
         order is reversed (descending). Can be used to precede
         other leading sequences or with semantic typing.
     t_  Attribute value is text, but ignore case differences.
     r_  Attribute value is 32-bit floating number.
     i_  Attribute value is 32-bit integer number.
     d_  Attribute value is date. If the catalog SRC2STF_PARMS
         sr_dateFormats list is set, these date formats will be 
         used.
     !REL
         A Pseudo-name for an attribute. This indicates that the
         sort key will be the relevancy number, not an actual meta-
         data attribute (e.g. !rel or !RELEVANCY both are acceptable)
         Note that MODE_RELEVANCY must be set in the prior search
         to get the relevancy metric else this will just sort by hit
         count.
     !HIT
         Sort by number of hits in the document.
     !FLD[n]
         A Pseudo-name for an attribute indicating catalog entry field.
         Note [n] is a 1 to n digit value indicating the catalog
         field number to sort. The field number must be in the range
         of catalog items (see CRIDS: Catalog Record Component IDs for values). Note that for
         the LAST field CAT_AUXINFO, the upper 16-bits of key can contain 
         a sub-field value (0..5) for the text components of this field.
         (Use 0xffff for sub-field to sort by entire item).
         Numeric values for the currently defined fields are:
              CAT_FILENAME   = 0; // Catalog key, usually path of file.
              CAT_ATTRS      = 1; // Attributes (not useful for sort)
              CAT_DATE1      = 2; // Not used (not useful for sort)
              CAT_DATE2      = 3; // Not used (not useful for sort)
              CAT_DATE3      = 4; // Last Modified date.
              CAT_INFO       = 5; // URL Data
              CAT_FILETYPE   = 6; // Numeric file type (text, XML, PDF, etc.)
              CAT_FILEOFFSET = 7; // file offset (not useful for sort)
              CAT_FILESIZE   = 8; // Size of document object in bytes.
              CAT_AUXINFO    = 9; // Extra text. Composed of 6 sub-fields.
                                  // Can sort by entire field or by one
                                  // of the subfields:
                  0: TITLE, 1: ABSTRACT, 2: ENCODING, 3: ALT_TITLE
                  4: ADDED_TEXT, 5: METADATA, Entire field: 0xffff
                  -- Metadata should be sorted using metadata attribute 
                     names, not with sub-field 5! (e.g. "/doc/lmd/@date")
                  -- [n] can be decimal or hex digits (e.g. 0xffff0001)
     Notes.
     
     1. All other named attributes are assumed by default to be case 
        sensitive text data (Unless a semantic has been defined for
        the fully qualified tag). 
     2. catSetXMLSemantics() can be used to set the 'type'
        of a node. These will override the naming conventions
        above.
     3. A leading root '/' slash is ignored.
     
     Examples:
     
     attrs[0] = "local/@r_money" --> Sort by attr r_money
                                     assume type is float format.
     OR

     attrs[0] = "local/@d_date"  --> Primary sort by attr d_date,
                                     assume type is date format.
     attrs[1] = "local/@i_count" --> Secondary sort on int i_count.
                                     assume type is integer format.
     OR

     attrs[0] = "!REL"           --> Primary sort by relevancy.
                                     
     attrs[1] = "!FLD4"          --> Secondary sort by file write 
                                     (modified) date (from catalog).
     attrs[2] = "local/@i_count" --> Tertiary sort by int i_count.
                                     assume type is integer format.
Returns:
RDOCHIT array where each object contains document id and number of hits.
Throws:
CatalogSearchException - See constant definitions.
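
The naming conventions for the attrs parameter can be checked on the application side before a sort is requested. The following is a minimal sketch only: SortAttr, parse and fieldSpec are hypothetical names that are not part of CatalogSearch. It assumes that '-' is the first character of an entry, that the type prefix sits on the last path component (after '@' when present), that all pseudo-names are case-insensitive (the text above confirms this only for !REL), and that a !FLD key is formed as (sub-field << 16) | field, as suggested by the 0xffff0001 example. Semantics set through catSetXMLSemantics() would override the result and are not modelled here.

```java
import java.util.Locale;

/** Hypothetical helper (not part of CatalogSearch): classifies one attrs entry. */
final class SortAttr {
    enum Kind { TEXT, TEXT_IGNORE_CASE, FLOAT, INTEGER, DATE, RELEVANCY, HITS, FIELD }

    final Kind kind;
    final boolean descending;
    final String name; // entry with the leading '-' and root '/' removed

    private SortAttr(Kind kind, boolean descending, String name) {
        this.kind = kind;
        this.descending = descending;
        this.name = name;
    }

    static SortAttr parse(String spec) {
        String s = spec.trim();
        // Assumption: '-' is the very first character of the entry.
        boolean descending = s.startsWith("-");
        if (descending) {
            s = s.substring(1);
        }
        // Note 3: a leading root slash is ignored.
        if (s.startsWith("/")) {
            s = s.substring(1);
        }
        if (s.startsWith("!")) {
            // Assumption: every pseudo-name is case-insensitive, as !REL is.
            String upper = s.toUpperCase(Locale.ROOT);
            if (upper.startsWith("!REL")) return new SortAttr(Kind.RELEVANCY, descending, s);
            if (upper.startsWith("!HIT")) return new SortAttr(Kind.HITS, descending, s);
            if (upper.startsWith("!FLD")) return new SortAttr(Kind.FIELD, descending, s);
            throw new IllegalArgumentException("Unknown pseudo-name: " + spec);
        }
        // Assumption: the type prefix sits on the last path component, after '@' if present.
        String leaf = s.substring(Math.max(s.lastIndexOf('/'), s.lastIndexOf('@')) + 1);
        Kind kind = Kind.TEXT; // Note 1: default is case-sensitive text
        if (leaf.startsWith("t_")) kind = Kind.TEXT_IGNORE_CASE;
        else if (leaf.startsWith("r_")) kind = Kind.FLOAT;
        else if (leaf.startsWith("i_")) kind = Kind.INTEGER;
        else if (leaf.startsWith("d_")) kind = Kind.DATE;
        return new SortAttr(kind, descending, s);
    }

    /** Builds a "!FLD" entry with the sub-field in the upper 16 bits of the key. */
    static String fieldSpec(int field, int subField) {
        int key = ((subField & 0xFFFF) << 16) | (field & 0xFFFF);
        return String.format("!FLD0x%08x", key);
    }
}
```

For instance, SortAttr.fieldSpec(9, 0xffff) yields an entry for the entire CAT_AUXINFO field, and SortAttr.parse("-local/@d_date") reports a descending date sort.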

catXSHitList

public java.util.Vector catXSHitList(int[] aaList,
                                     int start,
                                     int nItems)
                              throws CatalogSearchException
API: Retrieve the hit list (word location) Vector resulting from a search. Each vector element is a char[] of the same length as the input aaList array. Each entry of the aaList array parameter is an index (see below for values) selecting the attribute array item to return. Each item of an attribute array is a 16-bit value. Attribute array information includes the document number, paragraph number, word number, title level, context index, etc. of the word match.
      Attribute Array Index Values:

         STF_DOCUMENT        = 0;
         STF_DOCUMENT_HI     = 1;
         STF_DOCUMENT_PRIME  = 2;
         STF_PARAGRAPH       = 3;
         STF_PARAGRAPH_HI    = 4;
         STF_PARAGRAPH_PRIME = 5;
         STF_WORD            = 6;
         STF_TITLE_LEVELS    = 7;
         STF_ATTRIBUTE8      = 8;
         STF_ATTRIBUTE9      = 9;
         STF_ATTRIBUTE10     = 10;
         STF_ATTRIBUTE11     = 11;
         STF_ATTRIBUTE12     = 12;
         STF_ATTRIBUTE13     = 13;    // Context Instance (LO)
         STF_ATTRIBUTE14     = 14;    // Context Instance (HI)
         STF_ATTRIBUTE15     = 15;    // Context ID
         STF_FLAGS           = 16;
      ----------------------------

      Example call:

      aaList[0] = STF_TOKEN.STF_DOCUMENT;
      aaList[1] = STF_TOKEN.STF_DOCUMENT_HI;
      aaList[2] = STF_TOKEN.STF_PARAGRAPH;
      aaList[3] = STF_TOKEN.STF_PARAGRAPH_HI;
      aaList[4] = STF_TOKEN.STF_WORD;
      aaList[5] = STF_TOKEN.STF_ATTRIBUTE15;

      aaVector = catXSHitList(aaList, 1, N);
      aaHit    = (char []) aaVector.elementAt(i);

 Then each hit object in the returned Vector would be a char[6] with:
 
      docId     = aaHit[0] + (((int) aaHit[1]) << 16);
      parNo     = aaHit[2] + (((int) aaHit[3]) << 16);
      wrdNo     = aaHit[4];
      contextId = aaHit[5];
 
 See CatXSSearch for performing search.
 See catXSGetDocCount for maximum # hits N.

 Note that the size of the result vector is in the extended error value
 for SUCCESSFUL operations.

Parameters:
aaList - Array of integer indexes specifying attribute array data to return.
start - Indicates which item in the search set with which to start. Range 1..N, where N is the total number of hit results in set.
nItems - The total number of items to retrieve.
Returns:
Vector of char[] containing corresponding attribute array values. relevancy # (return in hitCount field).
Throws:
CatalogSearchException - See constant definitions.
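
The decoding of a returned char[] can be sketched as a small standalone helper. DecodedHit is a hypothetical name, not a class of this API, and the sketch assumes the six-slot aaList layout of the example call above (DOCUMENT, DOCUMENT_HI, PARAGRAPH, PARAGRAPH_HI, WORD, ATTRIBUTE15). In Java the shift must be parenthesized explicitly, because '+' binds more tightly than '<<'.

```java
/** Hypothetical value holder for one decoded hit; not part of the documented API. */
final class DecodedHit {
    final int docId;
    final int paragraph;
    final int word;
    final int contextId;

    private DecodedHit(int docId, int paragraph, int word, int contextId) {
        this.docId = docId;
        this.paragraph = paragraph;
        this.word = word;
        this.contextId = contextId;
    }

    /** Combines two 16-bit halves: low half as is, high half shifted into the upper 16 bits. */
    static int combine(char lo, char hi) {
        return lo | (hi << 16);
    }

    /** Expects the six-slot layout used in the example call (an assumption of this sketch). */
    static DecodedHit fromExampleLayout(char[] aaHit) {
        if (aaHit == null || aaHit.length != 6) {
            throw new IllegalArgumentException("Expected a char[6] hit record");
        }
        return new DecodedHit(
                combine(aaHit[0], aaHit[1]),  // STF_DOCUMENT, STF_DOCUMENT_HI
                combine(aaHit[2], aaHit[3]),  // STF_PARAGRAPH, STF_PARAGRAPH_HI
                aaHit[4],                     // STF_WORD
                aaHit[5]);                    // STF_ATTRIBUTE15 (Context ID)
    }
}
```

An application would call DecodedHit.fromExampleLayout on each element of the returned Vector after casting it to char[].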

catXSWildLookup

public java.util.Vector catXSWildLookup(long docId,
                                        java.lang.String sep)
                                 throws CatalogSearchException
API: Retrieve a vector of String, each element containing the actual word match from a search query containing wildcards. See CatXSSearch for performing search. See catXSGetDocCount for maximum # hits N.

Parameters:
docId - Document id indicating the specific match for a wildcard.
sep - If null, just return matching sym (word). Else, return the wildcard string, the string "sep", and the actual match.
Returns:
Vector of String containing the wildcard matches, if any.
Throws:
CatalogSearchException - See constant definitions.

catXSWildLookup

public java.util.Vector[] catXSWildLookup(com.doclinx.jftr.DOCHIT[] dh,
                                          java.lang.String sep)
                                   throws CatalogSearchException
API: Retrieve a vector array of String, each vector element containing the actual word match from a search query containing wildcards, one vector per document in the DOCHIT array. See CatXSSearch for performing search. See catXSGetDocCount for maximum # hits N.

Parameters:
dh - DOCHIT array of documents to match wildcard hits
sep - If null, just return matching sym (word). Else, return the wildcard string, the string "sep", and the actual match.
Returns:
Array of Vectors of Strings, one element per document hit.
Throws:
CatalogSearchException - See constant definitions.

catXSWildLookup

public void catXSWildLookup(java.util.Hashtable ht,
                            java.lang.String sep,
                            java.util.Vector[] v)
                     throws CatalogSearchException
API: Retrieve a vector array of String, each vector element containing the actual word match from a search query containing wildcards, one vector per document in the hashtable. See CatXSSearch for performing search. See catXSGetDocCount for maximum # hits N.

Parameters:
ht - Integer hashtable containing doc# to element in result array
sep - If null, just return matching sym (word). Else, return the wildcard string, the string "sep", and the actual match.
v - Array of Vectors, each vector containing String exact matches.
Throws:
CatalogSearchException - See constant definitions.
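
When sep is not null, each returned String carries the wildcard string, the separator, and the actual match. A minimal sketch for taking such an entry apart follows. It assumes the three parts are simply concatenated in that order and that sep does not occur inside the wildcard string; WildMatch is a hypothetical name and not part of this API.

```java
/** Hypothetical helper (not part of CatalogSearch): splits one catXSWildLookup entry. */
final class WildMatch {
    final String pattern; // wildcard string, or null when no separator was used or found
    final String match;   // the actual word that matched

    private WildMatch(String pattern, String match) {
        this.pattern = pattern;
        this.match = match;
    }

    static WildMatch split(String entry, String sep) {
        // With a null separator the entry is just the matching word.
        if (sep == null || sep.isEmpty()) {
            return new WildMatch(null, entry);
        }
        int at = entry.indexOf(sep);
        if (at < 0) {
            return new WildMatch(null, entry);
        }
        // Assumption: layout is <wildcard><sep><match>, split at the first separator.
        return new WildMatch(entry.substring(0, at), entry.substring(at + sep.length()));
    }
}
```

Choosing a separator that cannot appear in a query term (for example "=>") keeps the split unambiguous.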

catXSContextXPath

public java.lang.String catXSContextXPath(int docId,
                                          char contextId)
                                   throws CatalogSearchException
API: Return the string denoting the XPATH where a match (hit) occurred.
      For example, obtain JUST the context ID for the ith hit:

      aaList[0] = STF_TOKEN.STF_DOCUMENT;
      aaList[1] = STF_TOKEN.STF_DOCUMENT_HI;
      aaList[2] = STF_TOKEN.STF_ATTRIBUTE15;  // Context ID
      aaVec     = catXSHitList(aaList, 1, N);
      aaHit     = (char []) aaVec.elementAt(i);
      docId     = aaHit[0] + (((int) aaHit[1]) << 16);
      contextId = aaHit[2];
 
      And then obtain the XPATH for that hit:

      String xpath = catXSContextXPath(docId, contextId);
 
See CatXSSearch for performing search. See catXSHitList for obtaining the contextId of a hit.

Parameters:
docId - Document # from hitlist item returned by catXSHitList().
contextId - Context ID obtained from hitList item returned by catXSHitList().
Returns:
String containing the path expression where the hit occurred.
Throws:
CatalogSearchException - See constant definitions.

catXSGetDocCount

public long catXSGetDocCount()
                      throws CatalogSearchException
API: Retrieve the number of documents found in the last search.
 See CatXSSearch for performing search.

Returns:
The total number of documents matching the query. Return 0 if last query had no matches.
Throws:
CatalogSearchException - See constant definitions.

catXSGetHitCount

public long catXSGetHitCount()
                      throws CatalogSearchException
API: Retrieve the total number of hits matching the last search query.
 See CatXSSearch for performing search.
 See catXSHitList for accessing the hit list.

Returns:
The total number of hits from last search, 0 if no matches.
Throws:
CatalogSearchException - See constant definitions.
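
The total returned here bounds the 1-based start/nItems window accepted by catXSHitList. A minimal sketch of splitting a total into such windows is shown below. HitPager and windows are hypothetical names; the sketch only computes index pairs and does not call the API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of CatalogSearch): computes 1-based paging windows. */
final class HitPager {
    /**
     * Returns {start, nItems} pairs covering items 1..total in pages of at most pageSize.
     * Throws ArithmeticException if a start index does not fit in an int.
     */
    static List<int[]> windows(long total, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<int[]> pages = new ArrayList<>();
        for (long start = 1; start <= total; start += pageSize) {
            long remaining = total - start + 1;
            int nItems = (int) Math.min(pageSize, remaining); // last page may be short
            pages.add(new int[] { Math.toIntExact(start), nItems });
        }
        return pages;
    }
}
```

Each pair could then be passed as the start and nItems arguments of a retrieval call, with total taken from catXSGetHitCount().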

catXSGetDocMax

public long catXSGetDocMax(int mode)
                    throws CatalogSearchException
API: Retrieve the maximum Document ID (catalog entry #) for the primary, update, or both databases.
 See MODE_SUBDELS for omitting deleted entries.
 See mode DB selects for specifying database(s).

Parameters:
mode - Control which maximum value to return.
Returns:
Maximum document ID based upon mode setting.
Throws:
CatalogSearchException - See constant definitions.

catXSGetDocMax

public long catXSGetDocMax()
                    throws CatalogSearchException
API: Retrieve the current active number of documents in the catalog.

Returns:
Number of searchable documents.
Throws:
CatalogSearchException - See constant definitions.

catXSGetSearchTime

public long catXSGetSearchTime()
                        throws CatalogSearchException
API: Retrieve time required for last search (in milliseconds).
 See CatXSSearch for performing search.

Returns:
Number of milliseconds required for last search.
Throws:
CatalogSearchException - See constant definitions.

catXSGetExtendedError

public int catXSGetExtendedError()
                          throws CatalogSearchException
API: Return extended error code, refining errors for query or search.
 See QERR for query error codes when exception error is XS_RSQUERY.
 See RS_STATUS for all other extended codes 

Returns:
Extended error code from last CatalogSearchException.
Throws:
CatalogSearchException - See constant definitions. (returns only search system errors: XS_RSERROR, XS_SELECTERROR, XS_RSSEARCH).

catXSSetLogFile

public void catXSSetLogFile(java.lang.String sFileName,
                            int level)
                     throws CatalogSearchException
API: Set logging file and severity level for class error reporting.

Parameters:
sFileName - Fully qualified path name of log file. If null, just set the current level.
level - Debug severity level. See severity settings.
Throws:
CatalogSearchException - See constant definitions.

catXSSetLog

public void catXSSetLog(com.doclinx.jftr.Log sLog)
                 throws CatalogSearchException
API: Set logger.

Parameters:
sLog - Opened logger.
Throws:
CatalogSearchException - See constant definitions.

catXSGetCatalog

public com.doclinx.ftxml.CatalogManager catXSGetCatalog()
                                                 throws CatalogSearchException
API: Return CatalogManager handle associated with this search object.
 See CatalogManager for access methods.

Returns:
Read-only CatalogManager class object.
Throws:
CatalogSearchException - See constant definitions.

catXSFinish

public static void catXSFinish()
                        throws CatalogSearchException
API: Close all outstanding CatalogSearch objects that are OPEN. Note that this is a static method and should be called to conclude all search activity before exiting application.
 See CatalogSearch.catXSOpen method.

Throws:
CatalogSearchException - See constant definitions.

catXSGetHitsReturned

public int catXSGetHitsReturned()
                         throws CatalogSearchException
API: Returns the number of hits from last search AND then releases search result objects. Previous search results are now invalid.
 See catXSGetHitCount for more information.

Throws:
CatalogSearchException - See constant definitions.

DOC

public static int DOC(char doc,
                      char hiDoc)
API: Construct DocId from attribute array values.
 See catXSHitList for accessing the hit list.

Parameters:
doc - Low 16-bits of document ID
hiDoc - Upper 16-bits of document ID
Returns:
DocId created from 2 char values.

DOC

public static int DOC(char[] aaList)
API: Construct DocId from attribute array values (assumed to be at indexes 0 and 1!).
 See catXSHitList for accessing the hit list.

Parameters:
aaList - aa list with the first two items containing DOC information.
Returns:
DocId created from the first two aaList values.

catXSSetRelevancyTags

public void catXSSetRelevancyTags(int val,
                                  java.lang.String tag)
                           throws CatalogSearchException
API: Set multiplier for highly relevant tags (e.g. <title>). Allows application to set tag multiplier to influence relevancy feedback. Normally, multiplier is 1. The multiplier is logarithmic, not linear. Applications can set this value higher (max value suggested is 16). Any words that match a query and occur within specified tags will use the specified multiplier. A value of zero can be used to weight certain tags as non-applicable to relevancy valuations.
      See catXSSearch mode parameter to do relevancy search.
      See MODE_RELEVANCY mode for value to enable relevancy.
      See catXSGetRelevancyList for search results.

Parameters:
val - relevancy multiplier (0..16)
tag - XPath of tag expression or special value:
    "<TITLE>": If hit has attr array[7] set (title), use multiplier. (default is 8)
                  This is used primarily for HTML or user specified titles in text documents.
    "<PROX>" : If proximity used, then weight terms in proximity by multiplier. (default is 2)
    "<EXACT>": If words are in exact order as in the query, weight with multiplier. (default is 4)
 
Throws:
CatalogSearchException - See constant definitions.
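
An application may want to stage and validate multipliers before handing them to catXSSetRelevancyTags. The sketch below is one way to do that. RelevancyWeights is a hypothetical class, the defaults are those listed above for the three special tags, and the 0..16 range is treated as a hard limit here although the text only suggests 16 as a maximum.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical staging table (not part of CatalogSearch) for relevancy multipliers. */
final class RelevancyWeights {
    static final int MIN = 0;
    static final int MAX = 16; // suggested maximum, enforced here by assumption

    private final Map<String, Integer> weights = new LinkedHashMap<>();

    RelevancyWeights() {
        // Documented defaults for the special tag values.
        weights.put("<TITLE>", 8);
        weights.put("<PROX>", 2);
        weights.put("<EXACT>", 4);
    }

    RelevancyWeights set(String tag, int val) {
        if (val < MIN || val > MAX) {
            throw new IllegalArgumentException("Multiplier out of range 0..16: " + val);
        }
        weights.put(tag, val);
        return this;
    }

    /** Tags that were never set use the normal multiplier of 1. */
    int get(String tag) {
        Integer v = weights.get(tag);
        return v == null ? 1 : v;
    }

    /** Entries in insertion order; each could be passed on as (val, tag). */
    Map<String, Integer> entries() {
        return new LinkedHashMap<>(weights);
    }
}
```

Each entry of entries() corresponds to one catXSSetRelevancyTags(val, tag) call made before a MODE_RELEVANCY search.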

catXSIsObsolete

public boolean catXSIsObsolete()
                        throws CatalogSearchException
API: Checks to see if search catalog has been updated.

Returns:
true if catalog has been updated.
Throws:
CatalogSearchException - See constant definitions.

catXSFTRInfo

public com.doclinx.jftr.FTR catXSFTRInfo(com.doclinx.jftr.RS_DBH[] dbh)
                                  throws CatalogSearchException
API: Returns low-level FTR handle.

Parameters:
dbh - Database handles for primary and update
Returns:
FTR handle
Throws:
CatalogSearchException - See constant definitions.

catXSGetHF

public java.lang.String catXSGetHF(long docId)
                            throws CatalogSearchException
API: Create PDF Highlight File Format for a given document hit.

Parameters:
docId - The document for which to get the hit highlight XML.
Returns:
String with highlight information, null if doc not found.
Throws:
CatalogSearchException - See constant definitions.

catXSSetLocLimit

public static void catXSSetLocLimit(int limit)
API: Sets upper limit for unique words in wildcard or range searches.
	(Adjust limit for BLM exception 2)

Parameters:
limit - New upper limit (default 16384)

catXSSerial

public void catXSSerial(com.doclinx.ftxml.XS_SYNC sync)
                 throws java.lang.Exception
Throws:
java.lang.Exception

catXSGetAlts

public java.lang.String catXSGetAlts(int mode,
                                     java.lang.String symbol)
                              throws CatalogSearchException
API: Retrieve alternate word list via supported methods. These include thesaurus, fuzzy (spell), stemming, and plurals.
 NOTE: Any/all of the 4 lists can be combined together (e.g. plural/fuzzy).

Parameters:
mode - Specifies which DB and alternative list(s) to use. See MODE_PRM for start of discussion about search modes and options.
symbol - The word to find to create a set of alias words.
Returns:
Comma delimited list of alternative (alias) words.
Throws:
CatalogSearchException - See constant definitions.
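
Since the result is a comma delimited String, callers will usually split it before use. A minimal sketch follows. AltWords is a hypothetical helper, and the trimming of surrounding whitespace and skipping of empty items are assumptions about how the list should be cleaned up, not documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of CatalogSearch): splits a comma delimited alias list. */
final class AltWords {
    static List<String> parse(String commaList) {
        List<String> words = new ArrayList<>();
        if (commaList == null) {
            return words;
        }
        for (String part : commaList.split(",")) {
            String word = part.trim();
            if (!word.isEmpty()) { // ignore empty items such as a trailing comma
                words.add(word);
            }
        }
        return words;
    }
}
```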

catXSSpellSuggest

public java.util.Vector catXSSpellSuggest(int mode,
                                          java.lang.String symbol)
                                   throws CatalogSearchException
API: Retrieve alternate word list via supported methods. These include thesaurus, fuzzy (spell), stemming, and plurals.
 NOTE: Any/all of the 4 lists can be combined together (e.g. plural/fuzzy).

Parameters:
mode - Specifies which DB and alternative list(s) to use. See MODE_PRM for start of discussion about search modes and options.
symbol - The word to find to create a set of alias words.
Returns:
Vector of DictItem objects (word/count).
Throws:
CatalogSearchException - See constant definitions.

catXSSearchAbort

public void catXSSearchAbort()
API: Asynchronous abort for current search (running in another thread)


catXSClearSearch

public void catXSClearSearch(boolean reload)
API: Clears buffered search hits -- this is to reduce memory use after a search AND building of document search result list. After this point (for many applications), the internal results can be discarded. Note that if a call to a routine that requires these results is made and the reload parameter is set, then the search will be performed again.

Parameters:
reload - Redo search if required by another call; otherwise PERMANENTLY discard search results.

catXSMaxDocList

public static void catXSMaxDocList(int maxSize)
API: Sets the maximum number of DOCHIT objects for in-memory storage. Lists larger than this number are virtualized.

Parameters:
maxSize - Maximum # of DOCHITs to hold in memory.